The monitoring and management of high-volume, feature-rich traffic in large networks poses significant challenges in storage, transmission, and computational costs. The predominant approach to reducing these costs is based on performing a linear mapping of the data to a low-dimensional subspace such that a certain large percentage of the variance in the data is preserved in the low-dimensional representation. This variance-based subspace approach to dimensionality reduction forces a fixed choice of the number of dimensions, is not responsive to real-time shifts in observed traffic patterns, and is vulnerable to normal traffic spoofing. Based on theoretical insights proved in this paper, we propose a new distance-based approach to dimensionality reduction, motivated by the fact that the real-time structural differences between the covariance matrices of the observed and the normal traffic are more relevant to anomaly detection than the structure of the training data alone. Our approach, called the distance-based subspace method, allows a different number of reduced dimensions in different time windows and arrives at only the number of dimensions necessary for effective anomaly detection. We present centralized and distributed versions of our algorithm and, using simulation on real traffic traces, demonstrate the qualitative and quantitative advantages of the distance-based subspace approach.
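For orientation, the variance-based baseline described above can be sketched as follows. This is a minimal illustration of the conventional approach that the abstract contrasts against, not the distance-based subspace method itself. The function name `variance_based_dimension`, the parameter `variance_fraction`, and its default of 0.95 are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def variance_based_dimension(data, variance_fraction=0.95):
    """Pick the smallest number of principal components whose eigenvalues
    account for at least `variance_fraction` of the total variance.

    `data` has one row per observation and one column per traffic feature.
    Returns (k, basis), where `basis` holds the top-k eigenvectors of the
    sample covariance matrix as columns.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    eigvecs = eigvecs[:, order]

    # First index at which the cumulative variance share reaches the target.
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, variance_fraction)) + 1
    k = min(k, len(eigvals))
    return k, eigvecs[:, :k]
```

Because `k` here depends only on the training data and the chosen threshold, it stays fixed once computed, which is the limitation the abstract points to when it argues for choosing the number of dimensions per time window.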